Skip to content

Honor the console_fallback setting for engine.observed_state.logging_events send policy#2914

Merged
jmacd merged 4 commits into
open-telemetry:mainfrom
jmacd:jmacd/its-reporter-honor-console-fallback
May 12, 2026
Merged

Honor the console_fallback setting for engine.observed_state.logging_events send policy#2914
jmacd merged 4 commits into
open-telemetry:mainfrom
jmacd:jmacd/its-reporter-honor-console-fallback

Conversation

@jmacd
Copy link
Copy Markdown
Contributor

@jmacd jmacd commented May 9, 2026

Change Summary

console_fallback: false

would not disable telemetry from logging calls that overflow, due to a default setting.

The correct field engine.observed_state.logging_events.

Fixes #2916

@jmacd jmacd requested a review from a team as a code owner May 9, 2026 01:16
@github-actions github-actions Bot added the rust Pull requests that update Rust code label May 9, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 96.77419% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.26%. Comparing base (3350e33) to head (93537fc).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2914   +/-   ##
=======================================
  Coverage   86.26%   86.26%           
=======================================
  Files         715      715           
  Lines      272060   272123   +63     
=======================================
+ Hits       234700   234756   +56     
- Misses      36836    36843    +7     
  Partials      524      524           
Components Coverage Δ
otap-dataflow 87.23% <96.77%> (+<0.01%) ⬆️
query_abstraction 80.61% <ø> (ø)
query_engine 90.73% <ø> (ø)
otel-arrow-go 52.45% <ø> (ø)
quiver 92.25% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

jmacd and others added 3 commits May 9, 2026 17:05
…reporter

The InternalTelemetrySystem ITS reporter was hard-coded to use
SendPolicy::default(), which has console_fallback: true. Under load
this can produce hundreds of thousands of 'Channel full, dropping
observed event' raw_error! lines on stderr per second when the ITS
flume channel saturates, which:
  - drowns out useful diagnostic output
  - silently dominates /proc/<pid>/io wchar so any wire-bytes
    measurement based on that counter is contaminated
  - costs measurable CPU writing those messages

The engine config already has the right knob,
engine.observed_state.logging_events, with default
console_fallback: false for the logging path (only the engine_events
path defaults to true). It just was not being plumbed to the ITS
reporter.

Add an its_reporter_policy: SendPolicy parameter to
InternalTelemetrySystem::new and use it for both reporter
constructions in the ITS branch (with and without log tap). Pass
engine.observed_state.logging_events.clone() from the controller
startup. All test call sites updated to pass SendPolicy::default()
to preserve their existing behaviour.

Verified with cargo test -p otap-df-telemetry --lib (126 passed).
Verified end-to-end: a logger configured with global: its at 10k/s
and a downstream that backpressures (forcing channel-full drops at
~50%) now produces 0 'Channel full' lines vs the prior 150k lines
in 30s, and process CPU drops correspondingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jmacd jmacd force-pushed the jmacd/its-reporter-honor-console-fallback branch from e66bafd to 1f142c8 Compare May 9, 2026 17:28
let (sender, logs_receiver) = flume::bounded(config.reporting_channel_size);
let reporter = if let Some(log_tap) = &log_tap_handle {
ObservedEventReporter::new(SendPolicy::default(), sender)
ObservedEventReporter::new(logging_send_policy.clone(), sender)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - Could we add one regression test for this path with ProviderMode::ITS and console_fallback: false?

@jmacd jmacd marked this pull request as draft May 11, 2026 17:53
…policy

Covers both code paths in InternalTelemetrySystem::new where the ITS
reporter is constructed (with and without an InternalLogTapHandle),
verifying that the configured engine.observed_state.logging_events
SendPolicy is threaded through rather than replaced with
SendPolicy::default().

Adds test-only accessors:
- ObservedEventReporter::policy()
- InternalTelemetrySystem::its_reporter()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jmacd jmacd marked this pull request as ready for review May 12, 2026 17:04
@jmacd jmacd added this pull request to the merge queue May 12, 2026
Merged via the queue into open-telemetry:main with commit 3e94f1f May 12, 2026
85 checks passed
@jmacd jmacd deleted the jmacd/its-reporter-honor-console-fallback branch May 12, 2026 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Console logging fallback in ITS mode not honored

2 participants